System and program product for providing high performance data lookup

ABSTRACT

Under the present invention, index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user. It should be understood that as used herein, the term “document” is intended to refer to any type of electronically stored data.

The current application is a continuation application of co-pending U.S. patent application Ser. No. 11/095,997, filed on Mar. 31, 2005, which is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention generally relates to data lookup. Specifically, the present invention relates to a system and program product for providing high performance data lookup (e.g., document retrieval).

2. Related Art

As the use of information technology (IT) continues to increase, a growing number of organizations are turning to IT-based solutions for their data storage needs. For example, today an organization can store a countless number of “documents” electronically while consuming very little physical space. Such an IT-based approach can not only save overhead costs, but also allow for improved redundancy. Moreover, when storing documents electronically, computerized access can be provided for authorized individuals from virtually any location.

Unfortunately, electronic document storage has various drawbacks. For example, in order to provide efficient access to electronic documents, they must be indexed in some manner. Moreover, requests for documents must be handled correctly. Due to the manner in which the documents can be stored, there is often a latency involved with their retrieval.

In view of the foregoing, there exists a need for a method, system and program product for providing high performance data lookup. Specifically, a need exists for a methodology and a “view” in which stored documents can be indexed for rapid retrieval.

SUMMARY OF THE INVENTION

In general, the present invention provides a method, system and program product for providing high performance data lookup. Under the present invention, index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user.

A first aspect of the present invention provides a method for providing high performance data lookup, comprising: extracting data values from each of a set of documents; creating index keys for the set of documents using the extracted data values; populating the index keys into an index view; and automatically obtaining the set of documents using the index keys in the index view.

A second aspect of the present invention provides a method for providing high performance data lookup, comprising: generating index keys for a set of documents; populating an index view with the index keys; automatically obtaining the set of documents using the index keys; receiving a request for a desired document; and retrieving the desired document from the obtained set of documents based on the request and the index keys.

A third aspect of the present invention provides a system for providing high performance data lookup, comprising: means for generating index keys for a set of documents; means for populating an index view with the index keys; means for automatically obtaining the set of documents using the index keys; means for receiving a request for a desired document; and means for retrieving the desired document from the obtained set of documents based on the request and the index keys.

A fourth aspect of the present invention provides a program product stored on a computer readable medium for providing high performance data lookup, the computer readable medium comprising program code for performing the following steps: generating index keys for a set of documents; populating an index view with the index keys; automatically obtaining the set of documents using the index keys; receiving a request for a desired document; and retrieving the desired document from the obtained set of documents based on the request and the index keys.

A fifth aspect of the present invention provides a method for deploying an application for providing high performance data lookup, comprising: providing a computer infrastructure being operable to: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys; receive a request for a desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.

A sixth aspect of the present invention provides computer software for deploying an application for providing high performance data lookup, the computer software comprising instructions for causing a computer system to perform the following functions: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys; receive a request for a desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.

A seventh aspect of the present invention provides a view for indexing documents, comprising: an index key portion for storing index keys for a set of documents, wherein each of the index keys includes a plurality of data values extracted from a corresponding one of the set of documents, and wherein the plurality of data values for the index keys are separated by a connector.

An eighth aspect of the invention provides a computer-readable medium that includes computer program code to enable a computer infrastructure to provide high performance data lookup.

A ninth aspect of the invention provides a business method for providing high performance data lookup.

The illustrative aspects of the present invention are designed to solve the problems herein described and other problems not discussed, which are discoverable by a skilled artisan.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an illustrative system for providing high performance data lookup according to the present invention.

FIG. 2 shows an illustrative index view according to the present invention.

FIG. 3 shows an illustrative method flow diagram according to the present invention.

It is noted that the drawings of the invention are not to scale. The drawings are intended to depict only typical aspects of the invention, and therefore should not be considered as limiting the scope of the invention. In the drawings, like numbering represents like elements between the drawings.

DETAILED DESCRIPTION OF THE DRAWINGS

As indicated above, the present invention provides a method, system and program product for providing high performance data lookup. Under the present invention, index keys are generated for a set of documents. This is typically accomplished by examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Once the index keys are generated, an index view will be generated into which the index keys are populated. Using the index keys in the index view, an agent will automatically obtain the set of documents (i.e., in the background). Then, when a user requests one of the documents, the document will already have been retrieved from storage. As such, it can readily be provided to the user. It should be understood that as used herein, the term “document” is intended to refer to any collection of data that is stored electronically.

Referring now to FIG. 1, a system 10 for providing high performance data lookup according to the present invention is shown. As depicted, system 10 includes a computer infrastructure 12, which comprises a computer system 14 that can perform the various process steps described herein. Computer system 14 is intended to represent any type of computer system capable of carrying out the teachings of the present invention. For example, computer system 14 could be a laptop computer, a desktop computer, a workstation, a handheld device, a server, etc. In addition, as will be further described below, computer system 14 can be deployed and/or operated by a service provider that is providing high performance data lookup for users such as user 16. It should be appreciated that user 16 could directly access computer system 14 as shown, or could operate their own independent computer systems that communicate with computer system 14 over a network (e.g., the Internet, a wide area network (WAN), a local area network (LAN), a virtual private network (VPN), etc.). In the case of the latter, communications between computer system 14 and the user-operated computer system can occur via any combination of various types of communications links. For example, the communication links can comprise addressable connections that may utilize any combination of wired and/or wireless transmission methods. Where communications occur via the Internet, connectivity could be provided by conventional TCP/IP sockets-based protocol, and an Internet service provider could be used to establish connectivity to the Internet.

In any event, assume that user 16 is authorized to access documents 50 as maintained by organization 18. Under the present invention, high performance data lookup of documents 50 is provided. To provide this functionality, performance lookup system 40 is shown implemented on computer system 14 as computer program code. To this extent, computer system 14 is shown including a processing unit 20, a memory 22, a bus 24, and input/output (I/O) interfaces 26. Further, computer system 14 is shown in communication with external I/O devices/resources 28 and one or more storage systems 30. In general, processing unit 20 executes computer program code, such as performance lookup system 40, that is stored in memory 22 and/or storage system(s) 30. While executing computer program code, processing unit 20 can read and/or write data to/from memory 22, storage system(s) 30, and/or I/O interfaces 26. Bus 24 provides a communication link between each of the components in computer system 14. External devices 28 can comprise any devices (e.g., keyboard, pointing device, display, etc.) that enable a user to interact with computer system 14 and/or any devices (e.g., network card, modem, etc.) that enable computer system 14 to communicate with one or more other computing devices, such as those in organization 18 and/or operated by user 16.

Computer infrastructure 12 is only illustrative of various types of computer infrastructures for implementing the invention. For example, in one embodiment, computer infrastructure 12 comprises two or more computing devices (e.g., a server cluster) that communicate over a network to perform the various process steps of the invention. Moreover, computer system 14 is only representative of various possible computer infrastructures that can include numerous combinations of hardware. To this extent, in other embodiments, computer system 14 can comprise any specific purpose computing article of manufacture comprising hardware and/or computer program code for performing specific functions, any computing article of manufacture that comprises a combination of specific purpose and general purpose hardware/software, or the like. In each case, the program code and hardware can be created using standard programming and engineering techniques, respectively. In addition, processing unit 20 may comprise a single processing unit, or be distributed across one or more processing units in one or more locations, e.g., on a client and server. Similarly, memory 22 and/or storage system 30 can comprise any combination of various types of data storage and/or transmission media that reside at one or more physical locations. Further, I/O interfaces 26 can comprise any system for exchanging information with one or more external devices 28. Still further, it is understood that one or more additional components (e.g., system software, math co-processing unit, etc.) not shown in FIG. 1 can be included in computer system 14. However, if computer system 14 comprises a handheld device or the like, it is understood that one or more external devices 28 (e.g., a display) and/or storage system(s) 30 could be contained within computer system 14, not externally as shown.

Storage system 30 can be any type of system (e.g., a database) capable of providing storage for information under the present invention. To this extent, storage system 30 could include one or more storage devices, such as a magnetic disk drive or an optical disk drive. In another embodiment, storage system 30 includes data distributed across, for example, a local area network (LAN), wide area network (WAN) or a storage area network (SAN) (not shown). Although not shown, additional components, such as cache memory, communication systems, system software, etc., may be incorporated into computer system 14. Moreover, although not shown for brevity purposes, any computer systems operated by user 16 will likely contain computerized components similar to computer system 14. It should also be understood that organization 18 and documents 50 could be contained within infrastructure 12. They are shown as independent systems for illustrative purposes only.

Shown in memory 22 of computer system 14 is performance lookup system 40, which includes index key system 42, index view system 44, document retrieval system 46 and request processing system 48. Operation of each of these systems is discussed further below. However, it is understood that some of the various systems shown in FIG. 1 can be implemented independently, combined, and/or stored in memory for one or more separate computer systems 14 that communicate over a network. Further, it is understood that some of the systems/functionality may not be implemented and/or additional systems/functionality may be included as part of the present invention. Still yet, it is understood that the depiction of these systems shown in FIG. 1 is illustrative only and that the same functionality could be achieved with a different configuration. That is, the functionality of these systems could be combined into fewer systems, or broken down into additional systems.

Under the present invention, high performance lookup of documents 50 is provided. First, index key system 42 will create an index key for each document. To create the index keys, index key system 42 will analyze documents 50 and extract data values therefrom. These data values will be connected and separated by a separator to yield strings of data (with each string corresponding to a particular document). In addition, as will be shown below in conjunction with FIG. 2, the data values are positioned in the index keys in a descending hierarchical fashion.
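By way of illustration only, the key construction described above could be sketched as follows in Python. This is a minimal sketch under stated assumptions, not the implementation of index key system 42: the field names, the field order, and the function name make_index_key are hypothetical, and the tilde separator is borrowed from the description of FIG. 2.

```python
from typing import Mapping, Sequence

# Assumed separator; the description of FIG. 2 gives a tilde as one example.
SEPARATOR = "~"

# Hypothetical field order, most general value first (descending hierarchy).
FIELD_ORDER: Sequence[str] = ("year", "month", "region", "department", "doc_id")


def make_index_key(
    values: Mapping[str, object],
    order: Sequence[str] = FIELD_ORDER,
    sep: str = SEPARATOR,
) -> str:
    """Connect extracted data values into a single index key string."""
    parts = []
    for field in order:
        value = str(values[field])
        # A value containing the separator would make the key ambiguous.
        if sep in value:
            raise ValueError(f"value for {field!r} contains the separator {sep!r}")
        parts.append(value)
    return sep.join(parts)


# Hypothetical data values extracted from one document.
extracted = {
    "year": "2004",
    "month": "04",
    "region": "EMEA",
    "department": "sales",
    "doc_id": "00017",
}
key = make_index_key(extracted)
print(key)
```

Rejecting values that contain the separator is a design choice of this sketch; an escaping scheme would be an equally reasonable alternative.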

Once the index keys have been generated, index view system 44 will generate an index view into which the index keys are populated. Referring now to FIG. 2, index view 60 is shown in greater detail. As depicted, index view 60 includes a key window 62 where index keys 64 are listed. The index keys 64 shown each include multiple data values as extracted from a corresponding document. In a typical embodiment, each index key 64 includes five or more data values. As further shown, each data value is separated from the next by a separator such as a tilde (~). In addition, as indicated above, the data values are arranged within each index key in a descending hierarchical fashion (e.g., year 2004 is first, month 04 is second, etc.). Document type window 66 allows a specific type of document 68 to be selected for display of its corresponding index keys in key window 62.
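As a rough illustration, an index view of this kind could be modeled as a mapping from document type to the index keys populated for that type. The class name IndexView, its method names, and the sample document types below are assumptions made for this sketch and do not appear in the figures.

```python
from collections import defaultdict
from typing import DefaultDict, List


class IndexView:
    """Hypothetical in-memory stand-in for index view 60."""

    def __init__(self) -> None:
        # Document type -> index keys populated for that type.
        self._keys_by_type: DefaultDict[str, List[str]] = defaultdict(list)

    def populate(self, doc_type: str, index_key: str) -> None:
        """Add one index key under the given document type."""
        self._keys_by_type[doc_type].append(index_key)

    def document_types(self) -> List[str]:
        """Types that could be offered for selection (cf. document type window 66)."""
        return sorted(self._keys_by_type)

    def keys_for(self, doc_type: str) -> List[str]:
        """Index keys to list for the selected type (cf. key window 62)."""
        return sorted(self._keys_by_type.get(doc_type, []))


view = IndexView()
view.populate("invoice", "2004~04~EMEA~sales~00017")
view.populate("invoice", "2003~11~APAC~legal~00003")
view.populate("memo", "2004~05~EMEA~hr~00042")
print(view.keys_for("invoice"))
```

Because the data values are arranged from most general to most specific, a plain lexical sort of the keys groups them by year, then month, and so on, provided the leading values have a fixed width.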

Referring back to FIG. 1, once index view 60 has been populated, document retrieval system 46 will automatically retrieve the documents using their index keys 64 (FIG. 2). Specifically, under the present invention, document retrieval system 46 includes an automated agent or the like that analyzes the index keys 64, and obtains the corresponding documents 50. At that point, the documents 50 can be considered “local” to computer system 14 (e.g., in memory 22 or storage system 30).
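One possible way to picture the automated agent is a background worker that walks the index keys and copies each corresponding document into a local store before any request arrives. The sketch below assumes a hypothetical fetch callable that stands in for access to remote storage; the class name PrefetchAgent and the sample data are likewise illustrative only.

```python
import threading
from typing import Callable, Dict, Iterable, Optional


class PrefetchAgent:
    """Hypothetical background agent that makes documents local ahead of requests."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # assumed accessor for remote document storage
        self._local: Dict[str, bytes] = {}
        self._lock = threading.Lock()

    def _run(self, index_keys: Iterable[str]) -> None:
        for key in index_keys:
            data = self._fetch(key)  # potentially slow remote retrieval
            with self._lock:
                self._local[key] = data

    def start(self, index_keys: Iterable[str]) -> threading.Thread:
        """Begin obtaining documents in the background and return the worker thread."""
        worker = threading.Thread(target=self._run, args=(list(index_keys),), daemon=True)
        worker.start()
        return worker

    def get_local(self, index_key: str) -> Optional[bytes]:
        """Return the locally held document, or None if it has not been obtained."""
        with self._lock:
            return self._local.get(index_key)


# Hypothetical remote storage keyed by index key.
remote = {
    "2004~04~EMEA~sales~00017": b"Q2 sales report",
    "2003~11~APAC~legal~00003": b"Contract summary",
}
agent = PrefetchAgent(remote.__getitem__)
worker = agent.start(remote.keys())
worker.join()  # joined here only so the example is deterministic
print(agent.get_local("2004~04~EMEA~sales~00017"))
```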

Then, if user 16 requests a certain document, user 16 will issue a request via a user view that is received by request processing system 48. Upon receipt, request processing system 48 will parse the request to determine what document is being requested, and then retrieve that document by cross-referencing the index key for that document. In one embodiment, request processing system 48 can analyze the request and generate a user key for the requested document. The user key can resemble or be similar to the index key for that document. To this extent, request processing system 48 can be configured similarly to index key system 42. A user key that is determined to be identical or sufficiently similar to an index key upon comparison could correspond to the requested document. Such document would then be retrieved (e.g., from local storage of computer system 14) and returned to or displayed to user 16.
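The comparison of a user key against the index keys could, for example, be sketched as below. The text does not define “sufficiently similar”; this sketch assumes, purely for illustration, a character-level similarity ratio from Python's standard difflib module and a hypothetical threshold of 0.9. The function name match_index_key is also an assumption.

```python
from difflib import SequenceMatcher
from typing import Iterable, Optional


def match_index_key(
    user_key: str,
    index_keys: Iterable[str],
    threshold: float = 0.9,  # assumed cut-off for "sufficiently similar"
) -> Optional[str]:
    """Return the index key that is identical or most similar to the user key."""
    candidates = list(index_keys)
    if user_key in candidates:
        return user_key  # identical key: no similarity scoring needed
    best_key: Optional[str] = None
    best_score = 0.0
    for key in candidates:
        score = SequenceMatcher(None, user_key, key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    return best_key if best_score >= threshold else None


index_keys = ["2004~04~EMEA~sales~00017", "2003~11~APAC~legal~00003"]
exact = match_index_key("2003~11~APAC~legal~00003", index_keys)
near = match_index_key("2004~04~EMEA~sales~0017", index_keys)  # one digit missing
none = match_index_key("1999~01~X~y~1", index_keys)
print(exact, near, none)
```

The matched index key can then be used to look up the document in local storage. A stricter embodiment might compare the data values field by field instead of scoring the whole string.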

Referring now to FIG. 3, a method flow diagram 100 according to the present invention is shown. First step S1 is to generate index keys for a set of documents. In general, this involves examining the set of documents, and connecting data values extracted from the set of documents to yield the index keys. Second step S2 is to populate an index view with the index keys. Thereafter, third step S3 is to automatically obtain the set of documents using the index keys. Fourth step S4 is to receive a request for a desired document, and fifth step S5 is to retrieve the desired document from the obtained set of documents based on the request and the index keys.
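To make the ordering of steps S1 through S5 concrete, the flow can be condensed into a single hypothetical function. Everything here is an assumption made for illustration: documents are modeled as dictionaries, the request is assumed to carry the same fields as the index key, exact key equality is used for the final lookup, and the copy in step S3 merely stands in for retrieval from remote storage.

```python
from typing import Dict, List, Mapping, Optional, Sequence


def lookup_flow(
    documents: List[Mapping[str, str]],
    order: Sequence[str],
    request: Mapping[str, str],
    sep: str = "~",
) -> Optional[Dict[str, str]]:
    """Walk through steps S1 to S5 on in-memory data."""

    def to_key(values: Mapping[str, str]) -> str:
        return sep.join(str(values[field]) for field in order)

    # S1: generate an index key for each document.
    docs_by_key = {to_key(doc): doc for doc in documents}
    # S2: populate the index view (here simply a sorted list of keys).
    index_view = sorted(docs_by_key)
    # S3: obtain the documents ahead of time (copying stands in for remote retrieval).
    local_store = {key: dict(docs_by_key[key]) for key in index_view}
    # S4: receive the request and derive a user key from it.
    user_key = to_key(request)
    # S5: retrieve the desired document from the already obtained set.
    return local_store.get(user_key)


docs = [
    {"year": "2004", "month": "04", "doc_id": "00017", "body": "Q2 sales report"},
    {"year": "2003", "month": "11", "doc_id": "00003", "body": "Contract summary"},
]
order = ("year", "month", "doc_id")
found = lookup_flow(docs, order, {"year": "2004", "month": "04", "doc_id": "00017"})
missing = lookup_flow(docs, order, {"year": "1999", "month": "01", "doc_id": "00001"})
print(found, missing)
```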

While shown and described herein as a method and system for providing high performance data lookup, it is understood that the invention further provides various alternative embodiments. For example, in one embodiment, the invention provides a computer-readable medium that includes computer program code to enable a computer infrastructure to provide high performance data lookup within organizations. To this extent, the computer-readable medium includes program code that implements each of the various process steps of the invention. It is understood that the term “computer-readable medium” comprises one or more of any type of physical embodiment of the program code. In particular, the computer-readable medium can comprise program code embodied on one or more portable storage articles of manufacture (e.g., a compact disc, a magnetic disk, a tape, etc.), on one or more data storage portions of a computing device, such as memory 22 (FIG. 1) and/or storage system 30 (FIG. 1) (e.g., a fixed disk, a read-only memory, a random access memory, a cache memory, etc.).

In another embodiment, the invention provides a business method that performs the process steps of the invention on a subscription, advertising, and/or fee basis. That is, a service provider, such as an Internet Service Provider, could offer to provide high performance data lookup as described above. In this case, the service provider can create, maintain, support, etc., a computer infrastructure, such as computer infrastructure 12 (FIG. 1) that performs the process steps of the invention for one or more customers. In return, the service provider can receive payment from the customer(s) under a subscription and/or fee agreement and/or the service provider can receive payment from the sale of advertising content to one or more third parties.

In still another embodiment, the invention provides a method of providing high performance data lookup. In this case, a computer infrastructure, such as computer infrastructure 12 (FIG. 1), can be provided and one or more systems for performing the process steps of the invention can be obtained (e.g., created, purchased, used, modified, etc.) and deployed to the computer infrastructure. To this extent, the deployment of a system can comprise one or more of (1) installing program code on a computing device, such as computer system 14 (FIG. 1), from a computer-readable medium; (2) adding one or more computing devices to the computer infrastructure; and (3) incorporating and/or modifying one or more existing systems of the computer infrastructure to enable the computer infrastructure to perform the process steps of the invention.

As used herein, it is understood that the terms “program code” and “computer program code” are synonymous and mean any expression, in any language, code or notation, of a set of instructions intended to cause a computing device having an information processing capability to perform a particular function either directly or after either or both of the following: (a) conversion to another language, code or notation; and/or (b) reproduction in a different material form. To this extent, program code can be embodied as one or more of: an application/software program, component software/a library of functions, an operating system, a basic I/O system/driver for a particular computing and/or I/O device, and the like.

The foregoing description of various aspects of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and obviously, many modifications and variations are possible. Such modifications and variations that may be apparent to a person skilled in the art are intended to be included within the scope of the invention as defined by the accompanying claims.

1. A system for providing high performance data lookup, comprising: means for generating index keys for a set of documents; means for populating an index view with the index keys; means for automatically obtaining the set of documents using the index keys prior to a request for a desired document from a user; means for receiving a request for the desired document; and means for retrieving the desired document from the obtained set of documents based on the request and the index keys.
2. The system of claim 1, wherein the means for generating the index keys, comprises: means for extracting data values from the set of documents; and means for connecting data values to yield the index strings, wherein each of the index keys comprises a plurality of data values obtained from a corresponding one of the set of documents.
3. The system of claim 2, wherein the plurality of data values comprises at least five data values.
4. The system of claim 1, wherein the obtaining step is performed by an automated agent.
5. The system of claim 1, wherein the request is made by the user via a user view.
6. The system of claim 1, further comprising: means for generating a user key based on the request; and means for comparing the user key to the index keys to identify the desired document among the obtained set of documents.
7. A computer program product having a computer readable medium for providing high performance data lookup, the computer program product comprising: program code stored on a computer readable medium, which when executed would cause the computer to: generate index keys for a set of documents; populate an index view with the index keys; automatically obtain the set of documents using the index keys prior to a request for a desired document by a user; receive the request for the desired document; and retrieve the desired document from the obtained set of documents based on the request and the index keys.
8. The program product of claim 7, wherein the computer program product further comprises program code stored on a computer readable medium, which when executed would cause the computer to: extract data values from the set of documents; and connect data values to yield the index strings, wherein each of the index keys comprises a plurality of data values obtained from a corresponding one of the set of documents.
9. The program product of claim 8, wherein the plurality of data values comprises at least five data values.
10. The program product of claim 7, wherein the obtaining step is performed by an automated agent.
11. The program product of claim 7, wherein the request is made by the user via a user view.
12. The program product of claim 7, wherein the computer program product further comprises program code stored on a computer readable medium, which when executed would cause the computer to: generate a user key based on the request; and compare the user key to the index keys to identify the desired document among the obtained set of documents.